(54) Load distribution among servers in a TCP/IP network



(57) Methods and apparatus for hosting a network service on a cluster of servers, each including a primary and a secondary Internet Protocol (IP) address. A common cluster address is assigned as the secondary address to each of the servers in the cluster. The cluster address may be assigned in UNIX-based servers using an ifconfig alias option, and may be a ghost IP address that is not used as a primary address by any server in the cluster. Client requests directed to the cluster address are dispatched such that only one of the servers of the cluster responds to a given client request. The dispatching may use a routing-based technique, in which all client requests directed to the cluster address are routed to a dispatcher connected to the local network of the server cluster. The dispatcher then applies a hash function to the client IP address in order to select one of the servers to process the request. The dispatching may alternatively use a broadcast-based technique, in which a router broadcasts client requests having the cluster address to all of the servers of the cluster over a local network. The servers then each provide a filtering routine, which may involve comparing a server identifier with a hash value generated from a client address, in order to ensure that only one server responds to each request broadcast by the router.
Description

Field of the Invention 

The present invention relates generally to data communication networks such as the Internet and more particularly to techniques for hosting network services on a cluster of servers used to deliver data over a network in response to client requests, where the cluster of servers can be collectively identified by a client using a single-address image.

Background of the Invention 

With the explosive growth of the World Wide Web, many popular Internet web sites are heavily loaded with client requests. For example, it has been reported in S. L. Garfinkel, "The Wizard of Netscape," Webserver Magazine, July/August 1995, pp. 59-63, that home pages of Netscape Communications receive more than 80 million client requests or "hits" per day. A single server hosting a service is usually not sufficient to handle this type of aggressive growth. As a result, clients may experience slow response times and may be unable to access certain web sites. Upgrading the servers to more powerful machines may not always be cost-effective. Another common approach involves deploying a set of machines, also known as a cluster, and configuring the machines to work together to host a single service. Such a server cluster should preferably publicize only one server name for the entire cluster so that any configuration change inside the cluster does not affect client applications. The World Wide Web and other portions of the Internet utilize an application-level protocol, known as the Hypertext Transfer Protocol (HTTP), which is based on a client/server architecture. The HTTP protocol is described in greater detail in "Hypertext Transfer Protocol -- HTTP/1.0," Network Working Group, May 1996, <http://www.ics.uci.edu/pub/ietf/http>, which is incorporated by reference herein.

FIG. 1 illustrates an exemplary client/server architecture suitable for implementing HTTP-based network services on the Internet. A client 12 generates an HTTP request for a particular service, such as a request for information associated with a particular web site, and a Transmission Control Protocol/Internet Protocol (TCP/IP) connection is then established between the client 12 and a server 14 hosting the service. The client request is delivered to the server 14 in this example via a TCP/IP connection over a first network 16, a router 18 and a second network 20. The first network 16 may be a wide area communication network such as the Internet, while the second network 20 may be an Ethernet or other type of local area network (LAN) interconnecting server 14 with other servers in a server cluster. The router 18, also referred to as a gateway, performs a relaying function between the first and second networks which is transparent to the client 12.



The client request is generated by a web browser or other application-layer program operating in an application layer 22-1 of the client 12, and is responded to by a file transfer system or other program in an application layer 22-2 of the server 14. The requested network service may be designated by a Uniform Resource Locator (URL) which includes a domain name identifying the server 14 or a corresponding server cluster hosting the service. The application-level program of the client 12 initiates the TCP/IP connection by requesting a local or remote Domain Name Service (DNS) to map the server domain name to an IP address. The TCP and IP packet routing functions in client 12 and server 14 are provided in respective TCP layers 24-1, 24-2 and IP layers 26-1, 26-2. The TCP and IP layers are generally associated with the transport and network layers, respectively, of the well-known Open Systems Interconnection (OSI) model. The TCP layers 24-1, 24-2 process TCP packets of the client request and server response. The TCP packets each include a TCP header identifying a port number of the TCP connection between the client 12 and server 14. The IP layers 26-1, 26-2 process IP packets formed from the TCP packets of the TCP layers. The IP packets each include an IP header identifying an IP address of the TCP/IP connection between the client 12 and server 14.

The IP address for a given network service may be determined, as noted above, by the client accessing a conventional DNS. The IP layer 26-1 of the client 12 uses the resulting IP address as a destination address in the IP packet headers of client request packets. The IP address together with the TCP port number provide the complete transport address for the HTTP server process. The client 12 and server 14 also include data link and physical layers 28-1 for performing framing and other operations to configure client request or reply packets for transmission over the networks 16 and 20. The router 18 includes data link and physical layers 28-3 for converting client request and server reply packets to IP format, and an IP layer 26-3 for performing packet routing based on IP addresses. The server 14 responds to a given client request by supplying the requested information over the established TCP/IP connection in a number of reply packets. The TCP/IP connection is then closed.

There are many known techniques for distributing HTTP client requests to a cluster of servers. FIGS. 2 and 3 illustrate server-side single-IP-address image approaches which present a single IP address to the clients. An example of this approach is the TCP router approach described in D.M. Dias, W. Kish, R. Mukherjee and R. Tewari, "A Scalable and Highly Available Web Server," Proceedings of COMPCON '96, pp. 85-92, 1996, which is incorporated by reference herein. FIG. 2 illustrates the TCP router approach in which a client 12 establishes a TCP/IP connection over Internet 30 with a server-side router 32 having an IP address RA. The router 32 is connected via a LAN 36 to a server cluster 34 including N servers 14-i, i = 1, 2, ... N, having respective IP addresses S1, S2, ... SN. Each server of the cluster 34 generally provides access to the same set of contents, and the contents may be replicated on a local disk of each server, shared on a network file system, or served by a distributed file system.

The single-address image is achieved by publicizing the address RA of the server-side router 32 to the clients via the DNS. The client 12 therefore uses RA as a destination IP address in its request. The request is directed to the router 32, which then dispatches the request to a selected server 14-k of server cluster 34 based on load characteristics, as indicated by the dashed line connecting client 12 to server 14-k via router 32. The router 32 performs this dispatching function by changing the destination IP address of each incoming IP packet of a given client request from the router address RA to the address Sk of selected server 14-k. The selected server 14-k responds to the client request by sending reply packets over the established TCP/IP connection, as indicated by the dashed line connecting server 14-k to client 12. In order to make the TCP/IP connection appear seamless to the client 12, the selected server 14-k changes the source IP address in its reply packets from its address Sk to the router address RA. The advantages of this approach are that it does not increase the number of TCP connections, and it is totally transparent to the clients. However, since the above-noted source IP address change is performed at the IP layer in a given server, the kernel code of every server in the cluster has to be modified to implement this mechanism. A proposed hybrid of the DNS approach and the TCP router approach, in which a DNS server selects one of several clusters of servers using a round-robin technique, suffers from the same problem.
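
The two address changes of the TCP router approach can be sketched as follows. This is a minimal illustration only, not the cited implementation: the `IPPacket` type and the function names are hypothetical, the addresses are the symbolic labels RA and Sk used in the text, and the load-based selection of server 14-k is assumed to have happened already.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IPPacket:
    """Hypothetical stand-in for an IP packet header plus payload."""
    src: str          # source IP address
    dst: str          # destination IP address
    payload: str = ""


ROUTER_ADDR = "RA"    # symbolic router address publicized via the DNS


def router_dispatch(packet: IPPacket, server_addr: str) -> IPPacket:
    """Router side: change the destination address from RA to Sk."""
    if packet.dst != ROUTER_ADDR:
        raise ValueError("packet is not addressed to the router")
    return replace(packet, dst=server_addr)


def server_reply(request: IPPacket, server_addr: str, payload: str) -> IPPacket:
    """Server side: build the reply, then change its source from Sk to RA.

    In the approach described above this step runs in the IP layer of the
    server, which is why the server kernel code has to be modified.
    """
    reply = IPPacket(src=server_addr, dst=request.src, payload=payload)
    return replace(reply, src=ROUTER_ADDR)
```

From the client's point of view both the request destination and the reply source are RA, so the selected server stays hidden.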

FIG. 3 illustrates a server-side single-address image approach known as network address translation, as described in greater detail in E. Anderson, D. Patterson and E. Brewer, "The Magicrouter, an Application of Fast Packet Interposing," Symposium on Operating Systems Design and Implementation, OSDI, 1996, <http://www.cs.berkeley.edu/~eanders/magicrouter/osdi96-mr-submission.ps>, and Cisco Local Director, <http://www.cisco.com/warp/public/751/lodir/index.html>, which are incorporated by reference herein. As in the TCP router approach of FIG. 2, the client 12 uses the router address RA as a destination IP address in a client request, and the router 32 dispatches the request to a selected server 14-k by changing the destination IP address of each incoming request packet from the router address RA to the address Sk of selected server 14-k. However, in the network address translation approach, the source IP addresses in the reply packets from the selected server 14-k are changed not by server 14-k as in FIG. 2, but are instead changed by the router 32. The reply packet flow indicated by a dashed line in FIG. 3 thus passes from server 14-k to client 12 via router 32.

Compared to the TCP router approach of FIG. 2, network address translation has the advantage of server transparency. That is, no specific changes to the kernel code of the servers are required to implement the technique. However, both the TCP router and network address translation approaches require that the destination address in a request packet header be changed to a server address so that the server can accept the request. These approaches also require that the source address in a reply packet header be changed to the router address so that the client can accept the reply. These changes introduce additional processing overhead and unduly complicate the packet delivery process. In addition, because of the address changes, the above-described single-address image approaches may not be suitable for use with protocols that utilize IP addresses within an application, such as that described in K. Egevang and P. Francis, "The IP Network Address Translator," Network Working Group, RFC 1631, <http://www.safety.net/rfc1631.txt>, which is incorporated by reference herein. Furthermore, in both the TCP router and network address translation approaches, the router 32 needs to store an IP address mapping for every IP connection. Upon receiving an incoming packet associated with an existing TCP connection, the router has to search through all of the mappings to determine which server the packet should be forwarded to. The router itself may therefore become a bottleneck under heavy load conditions, necessitating the use of a more complex hardware design, as in the above-cited Cisco Local Director.
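
The per-connection state that these conventional approaches place on the router can be sketched as follows. This is an illustrative model under stated assumptions, not the design of any cited product: the `NatRouter` class, its method names, the keying of mappings by client address and port, and the round-robin placeholder for load-based selection are all hypothetical. A dictionary is used for brevity, whereas the text describes a search through all of the mappings.

```python
from typing import Dict, List, Tuple


class NatRouter:
    """Hypothetical router that keeps one mapping per client connection."""

    def __init__(self, router_addr: str, servers: List[str]) -> None:
        self.router_addr = router_addr
        self.servers = list(servers)
        # (client IP, client port) -> selected server address
        self.mappings: Dict[Tuple[str, int], str] = {}
        self._next = 0

    def forward_request(self, client_ip: str, client_port: int) -> str:
        """Return the server address that replaces RA as the destination."""
        key = (client_ip, client_port)
        if key not in self.mappings:
            # Placeholder for load-based selection (assumption: round robin).
            self.mappings[key] = self.servers[self._next % len(self.servers)]
            self._next += 1
        return self.mappings[key]

    def rewrite_reply_source(self, server_addr: str) -> str:
        """Replies leave with the router address as their source address."""
        return self.router_addr

    def close(self, client_ip: str, client_port: int) -> None:
        """Drop the mapping when the connection is closed."""
        self.mappings.pop((client_ip, client_port), None)
```

The table grows with the number of open connections, which is the state that the hash-based dispatching described later avoids.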

It is therefore apparent that a need exists for improved techniques for hosting a network service on a cluster of servers while presenting a single-address image to the clients, without the problems associated with the above-described conventional approaches.

Summary of the Invention 

The present invention provides methods and apparatus for hosting a network service on a cluster of servers. All of the servers in a server cluster configured in accordance with the invention may be designated by a single cluster address which is assigned as a secondary address to each server. All client requests for a web site or other network service associated with the cluster address are sent to the server cluster, and a dispatching mechanism is used to ensure that each client request is processed by only one server in the cluster. The dispatching may be configured to operate without increasing the number of TCP/IP connections required for each client request. The invention evenly distributes the client request load among the various servers of the cluster, masks the failure of any server or servers of the cluster by distributing client requests to the remaining servers without bringing down the service, and permits additional servers to be added to the cluster without bringing down the service. Although well-suited for use in hosting web site services, the techniques of the present invention may also be used to support a wide variety of other server applications.

In an exemplary embodiment of the invention, a network service is hosted by a server cluster in which each server includes a primary IP address and a secondary IP address. A common cluster address is assigned as the secondary IP address for each of the servers. The cluster address may be an IP address which does not correspond to a primary IP address of any of the servers. In UNIX-based servers, the cluster address may be assigned as the secondary address for a given server using an ifconfig alias option. If a given server includes multiple network interface cards, the cluster address may be assigned to one of the network interface cards using a UNIX ifconfig command without the alias option, or other similar technique. A router is coupled to a local network of the server cluster and is also coupled via the Internet to a client. The router receives client requests from the Internet, and uses a dispatching technique to direct client requests having the cluster address as a destination. The client requests are dispatched such that each of the requests is processed by only one of the servers in the cluster. The dispatching function may be based on the result of applying a hash function to an IP address of the given client. A suitable hash function may be determined using an analysis of a distribution of client IP addresses in an access log associated with one or more of the servers. In the event that a server has failed, the hash function may be reapplied to the client IP address to identify another server.
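
One way such an access-log analysis might be carried out is sketched below. The text does not specify the analysis, so everything here is an assumption made for illustration: the two candidate hash functions, the imbalance measure, and the function names are hypothetical, and the client addresses would come from a real access log.

```python
import ipaddress
from collections import Counter
from typing import Callable, Iterable, List


def low_byte(addr: int) -> int:
    """Candidate hash: the last octet of the 32-bit address."""
    return addr & 0xFF


def xor_fold(addr: int) -> int:
    """Candidate hash: XOR of the four octets of the address."""
    return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24)) & 0xFF


def load_per_server(client_ips: Iterable[str], num_servers: int,
                    hash_fn: Callable[[int], int]) -> List[int]:
    """Count how many logged requests each server would have received."""
    counts = Counter(
        hash_fn(int(ipaddress.ip_address(ip))) % num_servers
        for ip in client_ips
    )
    return [counts.get(i, 0) for i in range(num_servers)]


def imbalance(loads: List[int]) -> float:
    """Busiest server's load divided by the mean; 1.0 is perfectly even."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0
```

Running `load_per_server` over the same log with each candidate and comparing the `imbalance` values gives a simple basis for preferring the function that spreads the logged clients most evenly.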

Two illustrative dispatching techniques for providing a single-address image for a server cluster in accordance with the invention include routing-based dispatching and broadcast-based dispatching. In the routing-based technique, a dispatcher is coupled to the router and to a local network of the server cluster. The router directs client requests having the cluster address to the dispatcher, and the dispatcher selects a particular one of the servers to process a given client request based on the result of applying a hash function to the client address. In the broadcast-based technique, the router broadcasts client requests having the cluster address to each of the servers over the local network of the server cluster. Each of the servers implements a filtering routine to ensure that each client request is processed by only one of the servers. The filtering routine may involve applying a hash function to the client IP address associated with a given client request, and comparing the result to a server identifier to determine whether that server should process the client request.

The techniques of the present invention provide fast dispatching and can be implemented with reduced cost and complexity. The techniques are suitable for use in TCP/IP networks as well as networks based on a variety of other standards and protocols. Unlike the conventional single-address image approaches, the present invention does not require that a destination address in a request packet header be changed to a server address so that the server can accept the request, or that a source address in a reply packet header be changed to the router address so that the client can accept the reply. In addition, the router need not store an IP address mapping for every IP connection, nor is it required to search through such a mapping to determine which server a packet should be forwarded to. The router itself will therefore not become a bottleneck under heavy load conditions, and special router hardware designs are not required. These and other features and advantages of the present invention will become more apparent from the accompanying drawings and the following detailed description.

Brief Description of the Drawings 

FIG. 1 is a block diagram illustrating a conventional client-server interconnection in accordance with the TCP/IP standard;

FIG. 2 illustrates a prior art TCP router technique for hosting a network service on a cluster of servers;

FIG. 3 illustrates a prior art network address translation technique for hosting a network service on a cluster of servers;

FIG. 4 illustrates a technique for hosting a network service on a cluster of servers using routing-based dispatching in accordance with an exemplary embodiment of the invention; and

FIG. 5 illustrates a technique for hosting a network service on a cluster of servers using broadcast-based dispatching in accordance with another exemplary embodiment of the invention.

Detailed Description of the Invention 

The present invention will be illustrated below in conjunction with exemplary client/server connections established over the Internet to a server cluster using the Transmission Control Protocol/Internet Protocol (TCP/IP) standard. It should be understood, however, that the invention is not limited to use with any particular type of network or network communication protocol. The disclosed techniques are suitable for use with a wide variety of other networks and protocols. The term "server cluster" as used herein refers to a group or set of servers interconnected or otherwise configured to host a network service. The terms "cluster address" and "single-address image" refer generally to an address associated with a group of servers configured to support a network service or services. A "ghost IP address" is one type of cluster address in the form of an IP address which is not used as a primary address for any server of a given server cluster. The term "network service" is intended to include web sites, Internet sites and data delivery services, as well as any other data transfer mechanism accessible by a client over a network. The term "client request" refers to a communication from a client which initiates the network service. A given client request may include multiple packets or only a single packet, depending on the nature of the request.

The present invention provides an improved single-address image approach to distributing client requests to servers of a server cluster. In a preferred embodiment, the invention allows all servers of a server cluster to share a single common IP address as a secondary address. The secondary address is also referred to herein as a cluster address, and may be established using an ifconfig alias option available on most UNIX-based systems, or similar techniques available on other systems. The cluster address may be publicized to clients using the above-noted Domain Name Service (DNS) which translates domain names associated with Uniform Resource Locators (URLs) to IP addresses. All client requests to be directed to a service hosted by the server cluster are sent to the single cluster address, and dispatched to a selected one of the servers using routing-based or broadcast-based dispatching techniques to be described in greater detail below. Once a server is selected, future request packets associated with the same client request may be directed to the same server. All other communications within the server cluster may utilize primary IP addresses of the servers.

The above-noted ifconfig alias option is typically used to allow a single server to serve more than one domain name. For example, the ifconfig alias option allows a single server to attach multiple IP addresses, and thus multiple domain names, to a single network interface, as described in "Two Servers, One Interface" <http://www.thesphere.com/~dlp/TwoServers/>, which is incorporated by reference herein. Client requests directed to any of the multiple domain names can then be serviced by the same server. The server determines which domain name a given request is associated with by examining the destination address in the request packet. The present invention utilizes the ifconfig alias option to allow two servers to share the same IP address. Normally, two servers cannot share the same IP address because such an arrangement would cause any packet destined for the shared address to be accepted and responded to by both servers, confusing the client and possibly leading to a connection reset. Therefore, before a server is permitted to attach a new IP address to its network interface, a check may be made to ensure that no other server on the same local area network (LAN) is using that IP address. If a duplicate address is found, both servers are informed and warnings are issued. The routing-based or broadcast-based dispatching of the present invention ensures that every packet is processed by only one server of the cluster, such that the above-noted warnings do not create a problem.

An alternative technique for assigning a secondary address to a given server of a server cluster in accordance with the invention involves configuring the given server to include multiple network interface cards such that a different address can be assigned to each of the network interface cards. For example, in a UNIX-based system, conventional ifconfig commands may be used, without the above-described alias option, to assign a primary IP address to one of the network interface cards and a secondary IP address to another of the network interface cards. The secondary IP address is also assigned as a secondary IP address to the remaining servers in the cluster, and used as a cluster address for directing client requests to the cluster.

The exemplary embodiments of the present invention to be described below utilize dispatching techniques in which servers are selected based on a hash value of the client IP address. The hash value may be generated by applying a hash function to the client IP address, or by applying another suitable function to generate a hash value from the client IP address. For example, given N servers and a packet from a client having a client address CA, a dispatching function in accordance with the invention may compute a hash value k as CA mod (N-1) and select server k to process the packet. This ensures that all request or reply packets of the same TCP/IP connection are directed to the same server in the server cluster. A suitable hash function may be determined by analyzing a distribution of client IP addresses in actual access logs associated with the servers such that client requests are approximately evenly distributed to all servers. When a server in the cluster fails, the subset of clients assigned to that server will not be able to connect to it. The present invention addresses this potential problem by dynamically modifying the dispatching function upon detection of a server failure. If the hash value of a given client IP address maps to the failed server, the client IP address is rehashed to map to a non-failed server, and the connections of the remaining clients are not affected by the failure.
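
The hash-based selection and the rehash on failure might be sketched as follows. This is an illustration under stated assumptions rather than the dispatching function itself: the function names are hypothetical, the client address is reduced modulo the number of servers (a small departure from the literal CA mod (N-1) wording, chosen so that all N zero-based indices can be selected), and the rehash simply repeats the modulo over the list of surviving servers.

```python
import ipaddress
from typing import AbstractSet, Sequence


def client_hash(client_ip: str) -> int:
    """Treat the client IP address as an integer CA."""
    return int(ipaddress.ip_address(client_ip))


def select_server(client_ip: str, servers: Sequence[str],
                  failed: AbstractSet[str] = frozenset()) -> str:
    """Pick one server for a client; rehash only if that server has failed."""
    ca = client_hash(client_ip)
    chosen = servers[ca % len(servers)]
    if chosen not in failed:
        # Clients that map to a working server keep their assignment.
        return chosen
    alive = [s for s in servers if s not in failed]
    if not alive:
        raise RuntimeError("no server is available")
    # Rehash: map the same client address onto the surviving servers.
    return alive[ca % len(alive)]
```

Because the result depends only on the client address and the set of failed servers, every packet of a connection reaches the same server without any per-connection table.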

FIG. 4 illustrates a routing-based dispatching technique in accordance with the present invention. Solid lines indicate network connections, while dashed lines show the path of an exemplary client request and the corresponding reply. A client 52 sends a client request to a server cluster 54 including N servers 54-i, i = 1, 2, ... N having IP addresses S1, S2, ... SN and interconnected by an Ethernet or other type of LAN 56. The client request is formulated in accordance with the above-described HTTP protocol, and may include a URL with a domain name associated with a web site or other network service hosted by the server cluster 54. The client accesses a DNS to determine an IP address for the domain name of the service, and then uses the IP address to establish a TCP/IP connection for communicating with one of the servers 54-i of the server cluster 54. In accordance with the invention, a "ghost" IP address is publicized to the DNS as a cluster address for the server cluster 54. The ghost IP address is selected such that none of the servers 54-i of cluster 54 has that IP address as its primary address. Therefore, any request packets directed to the ghost IP address are associated with client requests for the service of the single-address image cluster 54. The use of the ghost IP address thus distinguishes a network service hosted by the cluster from activities of the servers 54-i which utilize the primary server addresses, and prevents interference with these primary address activities.

The client 52 uses the ghost IP address as a cluster address for directing its request to the server cluster 54. The request is directed over Internet 60 to a router 62 having an IP address RA. The router 62 includes a routing table having an entry or record directing any incoming request packets having the ghost IP address to a dispatcher 64 connected to the LAN 56. The dispatcher 64 includes an operating system configured to run in a router mode, using a routing algorithm which performs the dispatching described herein. In alternative embodiments, the functions of the dispatcher 64 could be incorporated into the router 62 in order to provide additional efficiency improvements. Each of the servers 54-i of the cluster 54 utilizes the above-described ifconfig alias option to set the ghost IP address as its secondary address. As noted above, this technique for setting a secondary address for each of the servers 54-i generally does not require any alteration of the kernel code running on the servers. In alternative embodiments, one or more of the servers 54-i may be configured to include multiple network interface cards, as previously noted, such that a different address can be assigned to each of the network interface cards of a given server using a UNIX ifconfig command or other similar technique.

The router 62 routes any packets having the ghost IP address to the dispatcher 64 in accordance with the above-noted routing table record. The dispatcher 64 then applies a hash function to the client IP address in a given request packet to determine which of the servers 54-i the given packet should be routed to. In the example illustrated in FIG. 4, the dispatcher 64 applies a hash function to the IP address of client 52 and determines that the corresponding request packet should be routed to server 54-2 at IP address S2. The dispatcher 64 then routes the request packet to the server 54-2 over LAN 56, as indicated by the dashed line, using the primary address S2 of server 54-2 to distinguish it from the other servers of cluster 54. After the network interface of server 54-2 accepts the packet, all higher level processing may be based on the ghost IP address because that is the destination address in the packet IP header and possibly in the application-layer packet contents. After processing the request, the server 54-2 replies directly to the client 52 via router 62 over the established TCP/IP connection, using the ghost IP address, and without passing through the dispatcher 64.
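
The request and reply path of FIG. 4 can be modeled in a few lines. The sketch below is a simplified simulation, not dispatcher code: the `Packet` and `Server` types, the `dispatch` function, and the modulo hash are hypothetical, and the addresses a caller supplies are placeholders. It is meant to show that the dispatcher picks a next hop by primary address while leaving the packet header, with the ghost address as destination, unchanged.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Packet:
    src: str  # source IP address
    dst: str  # destination IP address


@dataclass
class Server:
    primary: str    # unique primary address, used on the LAN by the dispatcher
    secondary: str  # shared ghost (cluster) address

    def accepts(self, packet: Packet) -> bool:
        """A server accepts packets for either of its two addresses."""
        return packet.dst in (self.primary, self.secondary)

    def reply_to(self, request: Packet) -> Packet:
        """Replies carry the ghost address as source and bypass the dispatcher."""
        return Packet(src=self.secondary, dst=request.src)


def dispatch(packet: Packet, servers: List[Server]) -> Tuple[str, Packet]:
    """Return the next-hop primary address and the unmodified packet."""
    k = int(ipaddress.ip_address(packet.src)) % len(servers)
    return servers[k].primary, packet
```

No address in the IP header is rewritten in either direction, which is the main difference from the TCP router and network address translation approaches.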

It should be noted that when a request packet destined for the ghost IP address is received by the network interface of dispatcher 64 and placed back onto the same network interface for delivery to one of the servers 54-i over LAN 56, it may cause an Internet control message protocol (ICMP) host redirect message to be sent to the router 62. This ICMP message is designed to direct the router 62 to update its routing table such that any future packets having the ghost IP address can bypass the dispatcher 64 and go directly to the destination server, as described in greater detail in W.R. Stevens, TCP/IP Illustrated, Vol. 1, Ch. 6, pp. 69-83, which is incorporated by reference herein. However, this effect is undesirable in the routing technique of FIG. 4 because the dispatcher 64 performs the server selection process as previously described. It therefore may be necessary to suppress the ICMP host redirect message for the ghost IP address by, for example, removing or altering the corresponding operating system code in the dispatcher. In the above-mentioned alternative embodiments in which the dispatching function is implemented within the router 62, the ICMP redirect message is not generated and therefore need not be suppressed. Another potential problem may arise when a reply packet is sent back to the client 52 from the selected server 54-2 with the ghost IP address, in that it may cause the router 62 to associate, in its Address Resolution Protocol (ARP) cache, the ghost IP address with the LAN address of the selected server. The operation of the ARP cache is described in greater detail in W.R. Stevens, TCP/IP Illustrated, Vol. 1, Chs. 4 and 5, pp. 53-68, which is incorporated by reference herein. The illustrative embodiment of FIG. 4 avoids this problem by automatically routing the request packets to the dispatcher 64, and then dispatching based on the server primary IP address, such that the router ARP cache is not used.
FIG. 5 illustrates a broadcast-based dispatching
technique in accordance with the present invention.
Again, solid lines indicate network connections, while
dashed lines show the path of an exemplary client
request and the corresponding reply. As in the FIG. 4
routing-based embodiment, client 52 sends a client request
to server cluster 54 including N servers 54-i, i = 1, 2, ...
N connected to LAN 56 and having IP addresses S1,
S2, ... SN. The client 52 uses the above-described ghost
address as a cluster address for directing its request to
the server cluster 54. The request is directed over
Internet 60 to a router 70 having an IP address RA. The
router 70 broadcasts any incoming request packets having
the ghost IP address to the LAN 56 interconnecting the
servers 54-i of the server cluster 54, such that the
request packet is received by each of the servers 54-i.
Each of the servers 54-i of the cluster 54 implements
a filtering routine in order to ensure that only one
of the servers 54-i processes a given client request. The
filtering routine may be added to a device driver of each
of the servers 54-i. In an exemplary implementation,
each of the servers 54-i is assigned a unique identification
(ID) number. The filtering routine of a given server
54-i computes a hash value of the client IP address and
compares it to the ID number of the given server. If the
hash value and the ID number do not match, the filtering
routine of the given server rejects the packet. If the hash
value and the ID number do match, the given server
accepts and processes the packet as if it had received the
packet through a conventional IP routing mechanism. In
the illustrative example of FIG. 5, a packet associated
with a request from client 52 is broadcast by the router 70
to each of the servers 54-i of the server cluster 54 over
the LAN 56 as previously noted. The filtering routine of
server 54-2 generates a hash value of the client IP
address which matches the unique ID number associated
with server 54-2, and server 54-2 therefore accepts and
processes the packet. The filtering routines of the N-1
other servers 54-i each indicate no match between the
client IP address and the corresponding server ID
number, and therefore discard the broadcast packet.
The reply packets are sent back to the client 52 via
router 70, as indicated by the dashed lines, using the ghost
IP address.
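
The filtering step described above can be sketched as follows. This is
a minimal illustration only, assuming that the servers carry ID numbers
0 to N-1 and that the hash is the client IPv4 address taken modulo N.
The names client_hash, accepts_packet and servers_accepting are
hypothetical and are not taken from the described implementation, in
which the routine resides in a device driver.

```python
import ipaddress
from typing import List


def client_hash(client_ip: str, num_servers: int) -> int:
    """Map a client IPv4 address to 0..num_servers-1 (assumed modulo hash)."""
    return int(ipaddress.IPv4Address(client_ip)) % num_servers


def accepts_packet(client_ip: str, server_id: int, num_servers: int) -> bool:
    """Filtering routine each server applies to every broadcast packet."""
    return client_hash(client_ip, num_servers) == server_id


def servers_accepting(client_ip: str, num_servers: int) -> List[int]:
    """Simulate the broadcast: list the IDs of servers that keep the packet."""
    return [sid for sid in range(num_servers)
            if accepts_packet(client_ip, sid, num_servers)]
```

Because every server evaluates the same function on the same client
address, exactly one ID matches for any given packet, and no
coordination between the servers is needed.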

The broadcast-based dispatching technique of FIG. 
5 may be implemented using a permanent ARP entry 
within the router 70, to associate the ghost IP address 
with the Ethernet or other local network broadcast
address associated with LAN 56 of the cluster 54. A
potential problem is that any reply packet from a selected
server appears to be coming from the ghost IP address,
and may therefore cause the router 70 to overwrite the
entry in its ARP cache such that the ghost IP address is
associated with the LAN address of the selected server.
This potential problem may be addressed by setting up
a routing table entry in the router 70 to direct all packets
having a ghost IP destination address to a second ghost
IP address which is a legal subnet address in the LAN
56 of the server cluster 54 but is not used by any server.
In addition, an entry is inserted in the ARP cache of the 
router 70 to associate the second ghost IP address with 
the broadcast address of the LAN 56. When the router 
70 routes a packet to the second ghost IP address, it 
will then actually broadcast the packet to each of the 
servers 54-i of the cluster 54. Since no reply packet is 
sent from the second ghost IP address, the correspond- 
ing entry of the router ARP cache will remain un- 
changed. Another potential problem is that some oper- 
ating systems, such as the NetBSD operating system, 
do not allow a TCP packet to be processed if it is re- 
ceived from a broadcast address. This potential problem 
may be avoided by a suitable modification to the broad- 
cast address in the LAN packet header attached to the 
packet. 
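
The second-ghost-address arrangement can be illustrated with a toy
model of the router state. This is a sketch under stated assumptions,
not router code: the Router class, its methods, and the addresses
(taken from documentation ranges) are all hypothetical, and the model
simply assumes that the router records the LAN address of any sender
it observes.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Router:
    """Toy router holding a routing table and an ARP cache (illustrative only)."""

    def __init__(self, ghost_ip, second_ghost_ip):
        # Routing table entry: packets for the ghost IP go via the second ghost IP.
        self.routes = {ghost_ip: second_ghost_ip}
        # ARP entry: the second ghost IP maps to the LAN broadcast address.
        self.arp_cache = {second_ghost_ip: BROADCAST_MAC}

    def forward(self, dest_ip):
        """Return the LAN address to which a packet for dest_ip is sent."""
        next_hop = self.routes.get(dest_ip, dest_ip)
        return self.arp_cache.get(next_hop)

    def observe_reply(self, source_ip, source_mac):
        """Model the router learning a sender's LAN address from a reply."""
        self.arp_cache[source_ip] = source_mac


router = Router("192.0.2.100", "192.0.2.200")
# A reply from the selected server carries the first ghost IP as its source.
router.observe_reply("192.0.2.100", "08:00:20:00:00:02")
# Requests for the ghost IP are still broadcast, since only the first
# ghost IP's cache entry was touched.
still_broadcast = router.forward("192.0.2.100") == BROADCAST_MAC
```

In this model the reply can only create or overwrite a cache entry for
the first ghost address, which the forwarding path never consults.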

The routing-based and broadcast-based dispatch- 
ing techniques described in conjunction with FIGS. 4 
and 5 above have been implemented on a cluster of Sun 
SPARC workstations. The NetBSD operating system, 
as described in NetBSD Project, <http://www.NetBSD.org>,
was used to provide any needed kernel code modifications.
The dispatching overhead associated with
both techniques is minimal because the packet
dispatching is based on simple IP address hashing, without
the need for storing or searching any address-mapping
information. In the routing-based dispatching technique,
the additional routing step in the dispatcher 64 typically
adds a delay of about 1 to 2 msecs to the TCP
round-trip time of each incoming request packet. A study in
W.R. Stevens, TCP/IP Illustrated, Volume 3, pp. 185-186,
which is incorporated by reference herein, indicates that
the median TCP round-trip time is 187 msecs. The
additional delay attributable to the routing-based
dispatching is therefore negligible. Although the additional
routing step for every request packet sent to the ghost IP
address may increase the traffic in the LAN of the server
cluster, the size of a request in many important
applications is typically much smaller than that of the
corresponding response, which is delivered directly to the
client without the additional routing. In the
broadcast-based dispatching technique, the broadcasting of each
incoming request packet on the LAN of the server
cluster does not substantially increase network traffic.
Although a hash value is computed for each incoming
packet having the ghost IP destination address, which
increases the CPU load of each server, this additional
computation overhead is negligible relative to the
corresponding communication delay.
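
As a rough check of the figures quoted above (an added delay of 1 to
2 msecs against a median round-trip time of 187 msecs), the relative
increase in round-trip time works out to about 0.5% to 1.1%:

```latex
\[
\frac{1\ \text{msec}}{187\ \text{msecs}} \approx 0.5\%,
\qquad
\frac{2\ \text{msecs}}{187\ \text{msecs}} \approx 1.1\%
\]
```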

Both the routing-based and broadcast-based
dispatching techniques of the present invention are
scalable to support relatively large numbers of servers.
Although the dispatcher in the routing-based technique
could present a potential bottleneck in certain
applications, a study in the above-cited D.M. Dias et al.
reference indicates that a single dispatcher can support up
to 75 server nodes, which is sufficient support for many
practical systems. The number of servers supported
may be even higher with the present invention given that
the routing-based dispatching functions described
herein are generally simpler than those in the D.M. Dias et
al. reference. It should also be noted that additional
scalability can be obtained by combining the routing-based
dispatching of the present invention with a DNS
round-robin technique. For example, a DNS server may be
used to map a domain name to one of a number of
different ghost IP addresses belonging to different server
clusters using a round-robin technique. In the
broadcast-based dispatching technique, there is no potential
dispatching bottleneck, although the device drivers or
other portions of the servers may need to be modified
to provide the above-described filtering routines.
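
The DNS round-robin combination mentioned above can be sketched as
follows. This is an illustration only: the ghost addresses (taken from
documentation address ranges), the domain name in the usage note, and
the function make_round_robin_resolver are hypothetical, and a real DNS
server would implement the rotation in its own record handling.

```python
from itertools import cycle

# Hypothetical ghost IP addresses, one per server cluster.
GHOST_ADDRESSES = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]


def make_round_robin_resolver(addresses):
    """Return a resolver that hands out the cluster addresses in rotation."""
    rotation = cycle(addresses)

    def resolve(domain_name):
        # Each lookup of the service name receives the next cluster's
        # ghost address; the name itself is not inspected in this sketch.
        return next(rotation)

    return resolve
```

Calling resolve("www.example.com") repeatedly on a resolver built from
GHOST_ADDRESSES yields the three addresses in turn and then wraps
around, so successive clients are spread across the clusters before
any per-cluster dispatching takes place.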

The routing-based and broadcast-based dispatching
of the present invention can also provide load
balancing and failure handling capabilities. For example,
given N servers and a packet from client address CA,
the above-described routing-based dispatching function
may compute a hash value k as CA mod (N-1) and select
server k to process the packet. More sophisticated
dispatching functions can also be used, and may involve
analyzing the actual service access log to provide more
effective load balancing. In order to detect failures, each
server may be monitored by a watchdog daemon such
as the watchd daemon described in greater detail in Y.
Huang and C. Kintala, "Software Implemented Fault
Tolerance: Technologies and Experience," Proceedings of
the 23rd International Symposium on Fault-Tolerant
Computing - FTCS, Toulouse, France, pp. 2-9, June
1993, which is incorporated by reference herein. When
a server fails, the corresponding watchd daemon
initiates a change of the dispatching function to mask the
failure and rebalance the load. A system call interface
may be implemented to allow the dispatching function
to be changed while the servers remain on-line. In
routing-based dispatching, the watchd daemon may notify
the dispatcher to change the dispatching function, while
in broadcast-based dispatching, all servers may be
notified to modify their filtering routines. For example, if a
server k fails, the new dispatching function may check
to see if the hash value CA mod N equals k. If it does,
a new hash value j = CA mod (N-1) is computed. If j is
less than k, the packet goes to server j. Otherwise, the
packet goes to server j+1. This technique does not affect
the clients of non-failed servers, reassigns the clients of
the failed server evenly to the remaining servers, and
can be readily extended to handle multiple server
failures. Additional servers can be added to the cluster
without bringing down the service by changing the
dispatching function from CA mod N to CA mod (N+1).
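
The failure-masking rule described above can be sketched as follows.
This is a minimal illustration, assuming that servers are numbered 0 to
N-1, that the client address CA is handled as an integer, and that the
base dispatching function is CA mod N as in the failure example. The
function names are hypothetical.

```python
def dispatch(client_addr: int, num_servers: int) -> int:
    """Base dispatching function: CA mod N."""
    return client_addr % num_servers


def dispatch_with_failure(client_addr: int, num_servers: int, failed: int) -> int:
    """Dispatching function after server `failed` (k in the text) goes down."""
    k = client_addr % num_servers
    if k != failed:
        # Clients of non-failed servers keep their original server.
        return k
    # Rehash the failed server's clients over the N-1 remaining servers.
    j = client_addr % (num_servers - 1)
    # Skip over the failed index so that j maps onto a live server.
    return j if j < failed else j + 1


def dispatch_after_growth(client_addr: int, num_servers: int) -> int:
    """Dispatching function after one server is added: CA mod (N+1)."""
    return client_addr % (num_servers + 1)
```

With N = 5 and server 2 failed, a client whose address hashes to 2 is
rehashed modulo 4 and lands on server 0, 1, 3 or 4, while every other
client is dispatched exactly as before.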

In routing-based dispatching, the dispatcher may 
become a single point of failure, and therefore should 
also be monitored by a watchd daemon or other suitable
failure monitoring mechanism. Upon detecting a failure, 
the watchd daemon may trigger a transfer of the dis- 
patching function from the primary dispatcher to a back- 
up dispatcher, and then direct the router to change the 
entry in its routing table such that future incoming re- 
quest packets are routed to the backup dispatcher. 
Since no mapping table is maintained by the primary dis- 
patcher, this approach is substantially stateless. Proper 
routing may be ensured by simply utilizing consistent 
routing functions in the primary and backup dispatchers, 
without the substantial additional costs associated with 
mapping-based approaches.

The use of the ifconfig alias option or other similar 
technique to provide a single-address image for a server 
cluster provides a number of advantages over the
conventional techniques described previously. For
example, it avoids the need to change the destination address
in a request packet header so that a particular server 
can accept the request, and the need to change the 
source address in a reply packet header to the cluster 
address so that the client can accept the reply. With the 
single-address image approach of the present
invention, all servers can accept and respond to packets
having the cluster address, so that the addresses in the
request and reply packet headers do not need to be
modified. Since the single-image approach of the present
invention does not require alteration of the packet
addresses, it is suitable for use with a wide variety of
protocols, including those protocols which utilize IP
addresses within an application program. In addition, the
single-address image approach of the present invention 
does not require a router to store or to search through 
a potentially large number of IP address mappings in 
order to determine which cluster server should receive
a request packet. The invention thus effectively
removes the possibility that the router may become a
bottleneck under heavy load conditions.

The above-described embodiments of the invention 
are intended to be illustrative only. Numerous alternative
embodiments may be devised by those skilled in the art 
without departing from the scope of the following claims. 



Claims

1. A method of routing client requests to a plurality of
servers configured to support a network service
over a communication network, each of the servers
having a primary address, the method comprising
the steps of:

assigning a common address as a secondary
address for each of the plurality of servers; and

processing client requests directed to the common
address such that each of the requests is
processed by a particular one of the plurality of
servers.

2. The method of claim 1 wherein the network utilizes
a TCP/IP protocol and the primary and secondary
addresses are primary and secondary IP addresses,
respectively.

3. The method of claim 2 wherein the common address
is an IP address which does not correspond
to a primary IP address of any of the plurality of
servers.

4. The method of any of the preceding claims wherein
at least one of the plurality of servers is a UNIX-based
server including multiple network interface
cards, and the assigning step includes assigning
the common address for the at least one server
using an ifconfig command.

5. The method of any of claims 1 to 3 wherein the
plurality of servers are UNIX-based servers, and the
assigning step includes assigning the common
address utilizing an ifconfig alias option for at least a
subset of the plurality of servers.

6. The method of any of the preceding claims wherein
the processing step includes the step of dispatching
a request of a given client to one of the plurality of
servers based on application of a hash function to
an IP address of the given client.

7. The method of claim 6 wherein the hash function is
determined based on an analysis of a distribution
of client IP addresses in an access log associated
with one or more of the servers.

8. The method of claim 6 wherein the dispatching step
includes reapplying the hash function to the client
IP address to identify another server if a server
identified as a result of a previous application of
the hash function has failed.

9. The method of any of claims 1 to 5 wherein the 
processing step includes the steps of: 

routing client requests directed to the common 
address to a dispatcher connected to a local 
network associated with the plurality of servers; 
and 

selecting a particular one of the servers to proc- 
ess a given client request based on application 
of a hash function to a corresponding client ad- 
dress in the dispatcher. 

10. The method of any of claims 1 to 5 wherein the 
processing step includes the steps of: 

broadcasting a given client request directed to 
the common address to each of the plurality of 
servers over a local network associated with 
the servers; and 

implementing a filtering routine in each of the 
plurality of servers so that the given client re- 
quest is processed by only one of the servers. 

11. The method of claim 10 wherein the implementing 
step includes the steps of: 

applying a hash function to a client IP address 
associated with the given client request; and 
comparing the result of the applying step to an 
identifier of a particular server to determine 
whether that server should process the given 
client request. 

12. An apparatus for routing client requests to a plurality
of servers configured to support a network service 
over a communication network, each of the servers 
having a primary address, the apparatus compris- 
ing: 

means for assigning a common address as a 
secondary address for each of the plurality of 
servers; and 

means for processing client requests directed 
to the common address such that each of the 
requests is processed by a particular one of the 
plurality of servers. 

13. The apparatus of claim 12 wherein the processing 
means is operative to dispatch a request of a given 
client to one of the plurality of servers based on ap- 
plication of a hash function to an IP address of the 
given client. 



14. The apparatus of claim 13 wherein the hash function
is determined based on an analysis of a distribution
of client IP addresses in an access log associated
with one or more of the servers.

15. The apparatus of claim 13 wherein the processing
means is further operative to reapply the hash function
to the client IP address to identify another server
if a server identified as a result of a previous
application of the hash function has failed.

16. The apparatus of any of claims 12 to 15 wherein the
processing means further includes a dispatcher
connected to a local network associated with the
plurality of servers, wherein the dispatcher is operative
to receive client requests directed to the common
address, and to select a particular one of the
servers to process a given client request based on
application of a hash function to a corresponding
client address.

17. The apparatus of any of claims 12 to 15 wherein the
processing means further includes:

means for broadcasting a given client request
directed to the common address to each of the
plurality of servers over a local network associated
with the servers; and

means for filtering the given client request in
each of the plurality of servers so that the given
client request is processed by only one of the
servers.

18. The apparatus of claim 17 wherein the filtering
means is operative to apply a hash function to a client
IP address associated with the given client request,
and to compare the result of the applying
step to an identifier of a particular server to determine
whether that server should process the given
client request.

19. An apparatus for routing client requests for a network
service over a communication network, the
apparatus comprising:

a plurality of servers configured to support the
network service, each of the servers having a
primary address and a secondary address,
wherein a common address is assigned as the
secondary address for each of the plurality of
servers; and

a router coupled to the servers and operative
to route client requests directed to the common
address such that each of the requests is processed
by a particular one of the plurality of servers.

20. The apparatus of claim 19 wherein the router is further
operative to route client requests such that a
request of a given client is routed to one of the plurality
of servers based on application of a hash function
to an IP address of the given client.

21. The apparatus of claim 20 wherein the hash function
is determined based on an analysis of a distribution
of client IP addresses in an access log associated
with one or more of the servers.

22. The apparatus of claim 20 wherein the hash function
is reapplied to the client IP address to identify
another server if a server identified as a result of a
previous application of the hash function has failed.

23. The apparatus of any of claims 19 to 22 further in- 
cluding a dispatcher coupled to the router and to a 
local network associated with the plurality of serv- 
ers, such that the router directs client requests hav- 
ing the common address to the dispatcher, and the 
dispatcher selects a particular one of the servers to 
process a given client request based on application 
of a hash function to a corresponding client ad- 
dress. 

24. The apparatus of any of claims 19 to 22 wherein the
router is further operative to broadcast a given client
request directed to the common address to each of 
the plurality of servers over a local network associ- 
ated with the servers, and further wherein each of 
the servers implements a filtering routine so that the 
given client request is processed by only one of the 
servers. 

25. The apparatus of claim 24 wherein the filtering rou- 
tine involves applying a hash function to a client IP 
address associated with the given client request, 
and comparing the result to an identifier of a partic- 
ular server to determine whether that server should 
process the given client request. 

26. The apparatus of any of claims 12 to 25 wherein the
network utilizes a TCP/IP protocol and the primary 
and secondary addresses are primary and second- 
ary IP addresses, respectively. 

27. The apparatus of claim 26 wherein the common address
is an IP address which does not correspond
to a primary IP address of any of the plurality of
servers.

28. The apparatus of any of claims 12 to 27 wherein at 
least one of the plurality of servers is a UNIX-based 
server including multiple network interface cards, 
and the common address is assigned for the at least 
one server using an ifconfig command. 

29. The apparatus of any of claims 12 to 27 wherein the
plurality of servers are UNIX-based servers, and the
common address is assigned as the secondary address
of the plurality of servers by utilizing an ifconfig
alias option for at least a subset of the plurality
of servers.